An instruction-level energy model for embedded VLIW architectures

Authors

  • Mariagiovanna Sami
  • Donatella Sciuto
  • Cristina Silvano
  • Vittorio Zaccaria
Abstract

In this paper, an instruction-level energy model is proposed for the data-path of very long instruction word (VLIW) pipelined processors. The model can be used to provide accurate power consumption information during either instruction-level simulation or power-oriented scheduling at compile time. The analytical model takes into account several software-level parameters (such as instruction ordering, pipeline stall probability, and instruction cache miss probability) as well as microarchitectural-level ones (such as pipeline stage power consumption per instruction), providing an efficient pipeline-aware instruction-level power estimation whose accuracy is very close to that of RT- or gate-level simulations. The problem of instruction-level power characterization of a $K$-issue VLIW processor is $O(N^{2K})$, where $N$ is the number of operations in the ISA and $K$ is the number of parallel instructions composing the very long instruction. One advantage of the proposed model is that it reduces the complexity of the characterization problem to $O(K \times N^2)$. The proposed model has been used to characterize a four-issue VLIW core with a six-stage pipeline, and its accuracy and efficiency have been compared with energy estimates derived by gate-level simulation. Experimental results (carried out on a set of embedded DSP benchmarks) show an average error of 4.8% for the instruction-level estimation engine with respect to the gate-level engine. The average simulation speed-up of the instruction-level power estimation engine over the gate-level engine is approximately four orders of magnitude.
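
To give a feel for the complexity reduction the abstract describes, the following minimal sketch counts characterization entries under two assumed scaling laws: $N^{2K}$ for the unreduced problem and $K \times N^2$ for the reduced one. The function names and the value N = 50 are hypothetical and chosen only for illustration; only K = 4 (the four-issue core) comes from the abstract, and constant factors hidden by the big-O notation are ignored.

```python
def unreduced_entries(n_ops: int, issue_width: int) -> int:
    """Entry count assuming the unreduced problem scales as N^(2K)."""
    return n_ops ** (2 * issue_width)


def reduced_entries(n_ops: int, issue_width: int) -> int:
    """Entry count assuming the reduced problem scales as K * N^2."""
    return issue_width * n_ops ** 2


if __name__ == "__main__":
    N = 50  # hypothetical ISA size, not taken from the paper
    K = 4   # issue width of the four-issue core mentioned in the abstract

    full = unreduced_entries(N, K)
    reduced = reduced_entries(N, K)
    print(f"unreduced: {full:.3e} entries")
    print(f"reduced:   {reduced} entries")
    print(f"ratio:     {full / reduced:.3e}")
```

Under these assumptions the unreduced count grows exponentially with the issue width, while the reduced count grows only linearly with it, which is why an exhaustive characterization quickly becomes impractical as K increases.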

Similar resources

An Intermediate Representation for the Exploitation of Instruction Level Parallelism in Embedded Synthesis and VLIW Compilation Environments

This paper introduces the All-Pairs Common Slack Graph (APCSG), an intermediate representation of the instruction level parallelism that exists within a computation. The APCSG is intended for use in high level synthesis systems and compilers that target VLIW architectures. To exploit the benefits of the APCSG, we have developed the Parallel Template Generation Algorithm, a general purpose frame...

Reducing the complexity of instruction-level power models for VLIW processors

The aim of this paper is to propose a high-level power exploration framework based on an instruction-level energy model for VLIW (Very Long Instruction Word) architectures. More specifically, the present paper deals with the reduction of the complexity of the energy model of K-issue VLIW processors from exponential with respect to the number of operations within the Instruction Set, $O(|ISA|^K)$, to...

Software Thread Integration for Converting TLP to ILP on VLIW/EPIC Architectures

SO, WON. Software Thread Integration for Converting TLP to ILP on VLIW/EPIC Architectures. (Under the direction of Alexander G. Dean.) Multimedia applications are pervasive in modern systems. They generally require a significantly higher level of performance than previous workloads of embedded systems. They have driven digital signal processor makers to adopt high-performance architectures like...

Exploring Energy-Performance Trade-Offs for Heterogeneous Interconnect Clustered VLIW Processors

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving clock speed, reducing energy consumption of the logic, and making design simpler, it introduces extra overheads by way of inter-cluster communication. This communication ...

Global Trade-off between Code Size and Performance for Loop Unrolling on VLIW Architectures

Many media processors [28, 7, 14, 8, 18, 27], used for computing intensive embedded applications, are VLIW architectures that rely on the compiler to exploit Instruction Level Parallelism. Loop unrolling is generally used to expose instruction parallelism but computing the unrolling factor is very difficult as instruction cache misses and spill code can cancel the expected benefit of the transforma...

Journal title:
  • IEEE Trans. on CAD of Integrated Circuits and Systems

Volume 21  Issue 

Pages  -

Publication date 2002